A Variable Selection Criterion for Linear Discriminant Rule and its Optimality in High Dimensional Setting

Authors

  • Masashi Hyodo
  • Tatsuya Kubokawa
Abstract

In this paper, we propose a new variable selection procedure, called MEC, for the linear discriminant rule in the high-dimensional setup. MEC is derived as a second-order unbiased estimator of the misclassification error probability of the linear discriminant rule. We show that MEC not only decomposes into ‘fitting’ and ‘penalty’ terms, like AIC and Mallows' Cp, but is also asymptotically optimal in the sense that it achieves the smallest possible conditional probability of misclassification among the candidate variable sets. Simulation studies show that MEC performs well at selecting the true variable set.
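To illustrate why an estimator of the misclassification error needs a 'penalty' term at all, the sketch below computes the classical naive plug-in estimate Phi(-D/2) for the two-group linear discriminant rule with equal priors, where D is the sample Mahalanobis distance on a chosen variable subset. This is not MEC: the abstract does not give the MEC formula, and the function names here (mean_vec, pooled_cov, solve, plugin_error) are hypothetical. The sketch also assumes the pooled covariance matrix is nonsingular, which requires more observations than variables.

```python
import math
import random


def mean_vec(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_cov(x1, x2):
    """Pooled sample covariance of two groups (divisor n1 + n2 - 2)."""
    m1, m2 = mean_vec(x1), mean_vec(x2)
    p = len(m1)
    s = [[0.0] * p for _ in range(p)]
    for rows, m in ((x1, m1), (x2, m2)):
        for r in rows:
            d = [r[j] - m[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    s[a][b] += d[a] * d[b]
    dof = len(x1) + len(x2) - 2
    return [[v / dof for v in row] for row in s]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.

    Assumes `a` is nonsingular (true for a positive definite covariance).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def plugin_error(x1, x2, subset):
    """Naive plug-in estimate Phi(-D/2) using only the variables in `subset`."""
    s1 = [[r[j] for j in subset] for r in x1]
    s2 = [[r[j] for j in subset] for r in x2]
    diff = [a - b for a, b in zip(mean_vec(s1), mean_vec(s2))]
    z = solve(pooled_cov(s1, s2), diff)
    d2 = sum(a * b for a, b in zip(diff, z))  # squared sample Mahalanobis distance
    d = math.sqrt(max(d2, 0.0))
    # Standard normal CDF at -d/2, written with the error function.
    return 0.5 * (1.0 + math.erf(-d / (2.0 * math.sqrt(2.0))))


if __name__ == "__main__":
    # Synthetic data (an assumption for the demo): only variable 0 separates the groups.
    random.seed(0)
    x1 = [[random.gauss(1.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(30)]
    x2 = [[random.gauss(-1.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(30)]
    for subset in ([0], [1], [0, 1]):
        print(subset, plugin_error(x1, x2, subset))
```

Because the sample Mahalanobis distance cannot decrease when variables are added, this naive estimate never gets worse for a larger subset, so minimizing it alone would always pick the full variable set. A criterion with a bias-correcting penalty, as the abstract describes for MEC, is meant to counter that optimism.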

Similar resources

Feature Selection in High-Dimensional Classification

High-dimensional discriminant analysis is of fundamental importance in multivariate statistics. Existing theoretical results sharply characterize different procedures, providing sharp convergence results for the classification risk, as well as the l2 convergence results to the discriminative rule. However, sharp theoretical results for the problem of variable selection have not been established...

Asymptotic optimality of a cross-validatory predictive approach to linear model selection

Abstract: In this article we study the asymptotic predictive optimality of a model selection criterion based on the cross-validatory predictive density, already available in the literature. For a dependent variable and associated explanatory variables, we consider a class of linear models as approximations to the true regression function. One selects a model among these using the criterion unde...

A linear constrained distance-based discriminant analysis for hyperspectral image classification

Fisher's linear discriminant analysis (LDA) is a widely used technique for pattern classification problems. It employs Fisher's ratio, the ratio of the between-class scatter matrix to the within-class scatter matrix, to derive a set of feature vectors by which high-dimensional data can be projected onto a low-dimensional feature space in the sense of maximizing class separability. This paper presents a linea...

Asymptotic optimality of sparse linear discriminant analysis with arbitrary number of classes

Many sparse linear discriminant analysis (LDA) methods have been proposed to overcome the major problems of the classic LDA in high-dimensional settings. However, the asymptotic optimality results are limited to the case that there are only two classes, which is due to the fact that the classification boundary of LDA is a hyperplane and explicit formulas exist for the classification error in th...

The subselect R package

The subselect package addresses the issue of variable selection in different statistical contexts, among which exploratory data analyses; univariate or multivariate linear models; generalized linear models; principal components analysis; linear discriminant analysis, canonical correlation analysis. Selecting variable subsets requires the definition of a numerical criterion which measures the qu...

Publication date: 2013